A Workload-Adaptive Streaming Partitioner for Distributed Graph Stores

Abstract

Streaming graph partitioning methods have recently gained attention due to their ability to scale to very large graphs with limited resources. However, many such methods do not consider workload and graph characteristics. This may degrade the performance of queries by increasing inter-node communication and computational load imbalance. Moreover, existing workload-aware methods cannot consistently provide good performance, as they do not consider dynamic workloads that keep emerging in graph applications. We address these issues by proposing a novel workload-adaptive streaming partitioner named WASP, which aims to achieve low-latency and high-throughput online graph queries. As each workload typically contains frequent query patterns, WASP exploits the existing workload to capture active vertices and edges, which are frequently visited and traversed, respectively. This information is used to heuristically improve the quality of partitions, either by avoiding the concentration of active vertices in a few partitions, proportional to their visit frequencies, or by reducing the probability of cutting active edges, proportional to their traversal frequencies. To assess the impact of WASP on a graph store and to show how easily the approach can be plugged on top of the system, we exploit it in a distributed graph-based RDF store. Our experiments over three synthetic and real-world graph datasets and the corresponding static and dynamic query workloads show that WASP achieves better query performance than state-of-the-art graph partitioners, especially on dynamic query workloads.
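The abstract describes the two heuristics only at a high level, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes a one-pass greedy vertex partitioner in which the function name `partition_stream`, the inputs `visit_freq` and `trav_freq`, the weight `alpha`, and the scoring formula itself are all hypothetical choices made for illustration. A vertex is attracted to a partition that already holds its neighbours (more strongly across frequently traversed edges) and repelled from partitions that already hold a lot of visit frequency.

```python
def partition_stream(stream, k, capacity, visit_freq, trav_freq, alpha=1.0):
    """Assign each streamed vertex to one of k partitions (illustrative only).

    stream      iterable of (vertex, neighbours) pairs, read once in order
    visit_freq  dict vertex -> how often the workload visits it (assumed input)
    trav_freq   dict frozenset({u, v}) -> how often the edge is traversed
    alpha       hypothetical weight of the hot-vertex spreading penalty
    """
    part_of = {}          # vertex -> partition index
    size = [0] * k        # vertices placed per partition
    load = [0.0] * k      # accumulated visit frequency per partition

    for v, neighbours in stream:
        f_v = visit_freq.get(v, 0.0)

        # Attraction: already-placed neighbours, weighted by edge traversal
        # frequency, so cutting a hot edge costs more than cutting a cold one.
        affinity = [0.0] * k
        for u in neighbours:
            p = part_of.get(u)
            if p is not None:
                affinity[p] += 1.0 + trav_freq.get(frozenset((u, v)), 0.0)

        candidates = [p for p in range(k) if size[p] < capacity]
        if not candidates:
            raise ValueError("all partitions are full")

        def score(p):
            balance = 1.0 - size[p] / capacity      # favour emptier partitions
            hot_penalty = alpha * f_v * load[p]     # spread active vertices
            return affinity[p] * balance - hot_penalty

        # Ties go to the smaller partition, then to the lower index.
        best = max(candidates, key=lambda p: (score(p), -size[p], -p))
        part_of[v] = best
        size[best] += 1
        load[best] += f_v

    return part_of


# Toy workload: "a" and "b" are hot vertices, edge a-c is traversed often.
stream = [("a", []), ("b", ["a"]), ("c", ["a"]), ("d", [])]
visit_freq = {"a": 10.0, "b": 10.0}
trav_freq = {frozenset(("a", "c")): 5.0}
parts = partition_stream(stream, k=2, capacity=3,
                         visit_freq=visit_freq, trav_freq=trav_freq)
```

In this toy run the two hot vertices end up in different partitions, while the cold vertex `c` is placed with `a` because the edge between them is traversed frequently. A real system would also have to refresh the frequencies as the workload changes, which this sketch does not attempt.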

Related resources

A Scalable Distributed Graph Partitioner

We present Scalable Host-tree Embeddings for Efficient Partitioning (Sheep), a distributed graph partitioning algorithm capable of handling graphs that far exceed main memory. Sheep produces high quality edge partitions an order of magnitude faster than both state of the art offline (e.g., METIS) and streaming partitioners (e.g., Fennel). Sheep’s partitions are independent of the input graph di...

Workload-aware Streaming Graph Partitioning

Partitioning large graphs, in order to balance storage and processing costs across multiple physical machines, is becoming increasingly necessary as the typical scale of graph data continues to increase. A partitioning, however, may introduce query processing latency due to inter-partition communication overhead, especially if the query workload exhibits skew, frequently traversing a limited su...

GraSP: Distributed Streaming Graph Partitioning

This paper presents a distributed, streaming graph partitioner, Graph Streaming Partitioner (GraSP), which makes partition decisions as each vertex is read from memory, simulating an online algorithm that must process nodes as they arrive. GraSP is a lightweight high-performance computing (HPC) library implemented in MPI, designed to be easily substituted for existing HPC partitioners such as P...

Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner

We design and implement a distributed algorithm for balanced k-way hypergraph partitioning that minimizes fanout, a fundamental hypergraph quantity also known as the communication volume and (k − 1)-cut metric, by optimizing a novel objective called probabilistic fanout. This choice allows a simple local search heuristic to achieve comparable solution quality to the best existing hypergraph par...

Adaptive Algorithms for Managing a Distributed Data Processing Workload

Workload management, a function of the OS/390 operating system base control program, allows installations to define business objectives for a clustered environment (Parallel Sysplex™ in OS/390). This business policy is expressed in terms that relate to business goals and importance, rather than the internal controls used by the operating system. OS/390 ensures that system resources are assign...

Journal

Journal title: Data Science and Engineering

Year: 2021

ISSN: 2364-1541, 2364-1185

DOI: https://doi.org/10.1007/s41019-021-00156-2